_____________________________________________________________________________

Overview of the source QAQC file(s)

_____________________________________________________________________________


CSV data file analyzed in this report:

## File located in folder: Connectivity-Networks/
File: LAGOSUS_NETSv1.0_MedRes_Metrics_Dams.csv


Structure of the QAQC file:

## 'data.frame':    86511 obs. of  22 variables:
##  $ lagoslakeid                  : chr  "8056" "19412" "7956" "68446" ...
##  $ nhdplusv2_comid              : chr  "142826768" "142806711" "142978929" "145050695" ...
##  $ lake_net_upstreamlake_km     : num  NA 2.02 4.37 NA NA ...
##  $ lake_net_downstreamlake_km   : num  NA 11.766 0.385 88.237 NA ...
##  $ lake_net_bidirectionallake_km: num  21.717 2.02 0.385 22.559 19.803 ...
##  $ lake_net_upstreamlake_n      : int  0 1 5 0 0 0 3 0 1 0 ...
##  $ lake_net_downstreamlake_n    : int  0 1 1 1 0 1 1 1 1 0 ...
##  $ lake_net_lakeorder           : int  1 2 4 0 2 0 2 0 1 0 ...
##  $ lake_net_lnn                 : int  1 2 4 1 1 1 5 1 2 1 ...
##  $ net_id                       : int  33 33 33 33 33 33 33 33 33 33 ...
##  $ net_lakes_n                  : int  1731 1731 1731 1731 1731 1731 1731 1731 1731 1731 ...
##  $ net_averagelakedistance_km   : num  132 132 132 132 132 ...
##  $ network_averagelakearea_ha   : num  127 127 127 127 127 ...
##  $ lake_net_nearestdamdown_km   : num  NA NA NA 118 0 ...
##  $ lake_net_nearestdamdown_id   : chr  NA NA NA "18622" ...
##  $ lake_net_totaldowndam_n      : int  NA NA NA 1 2 NA NA NA NA NA ...
##  $ lake_net_nearestdamup_km     : num  NA NA NA NA NA NA NA NA NA NA ...
##  $ lake_net_up_damid            : chr  NA NA NA NA ...
##  $ lake_net_totaldamup_n        : int  NA NA NA NA NA NA NA NA NA NA ...
##  $ lake_net_damonlake_flag      : chr  NA NA NA NA ...
##  $ lake_net_multidam_flag       : chr  NA NA NA NA ...
##  $ net_dams_n                   : int  286 286 286 286 286 286 286 286 286 286 ...


_____________________________________________________________________________

Checks on matching QAQC file and GIS layer

_____________________________________________________________________________


GIS layer used for comparison:

## File geodatabase is ../../../LAGOS_US_GIS_Data_v0.7.gdb
## 
## GDB layer is LAGOS_US_All_Lakes_1ha_points


Are there the same number of rows in the QAQC file and GIS layer?

## Number of lagoslakeids in the QAQC file is 86511
## Number of lagoslakeids in the GIS layer is 479950


lagoslakeids in the GIS layer that don’t match the QAQC file:

## 393439 lagoslakeids in the GIS gdb are NOT matched in the QAQC dataset


lagoslakeids in the QAQC file that don’t match the GIS layer:

## All lagoslakeids in the QAQC dataset are matched in the GIS gdb
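
The two checks above amount to set differences on lagoslakeid, one in each direction. A minimal Python sketch of that idea follows; it is not the code that produced this report, the function name `compare_ids` is hypothetical, and the toy IDs "1" and "2" are made up for illustration.

```python
def compare_ids(qaqc_ids, gis_ids):
    """Return the IDs that are unmatched in each direction."""
    qaqc, gis = set(qaqc_ids), set(gis_ids)
    return {
        "gis_not_in_qaqc": gis - qaqc,  # GIS lakes absent from the QAQC file
        "qaqc_not_in_gis": qaqc - gis,  # QAQC lakes absent from the GIS layer
    }

# Toy input: three IDs from the structure listing plus two made-up GIS-only IDs.
result = compare_ids(
    ["8056", "19412", "7956"],
    ["8056", "19412", "7956", "1", "2"],
)
print(len(result["gis_not_in_qaqc"]), len(result["qaqc_not_in_gis"]))  # 2 0

# When every QAQC ID is present in the GIS layer, the number of unmatched
# GIS IDs equals the difference of the two counts reported above.
assert 479950 - 86511 == 393439
```

Reading the IDs as strings (as in the structure listing, where lagoslakeid is `chr`) avoids spurious mismatches between values such as "8056" and 8056.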


_____________________________________________________________________________

Checks on matching variable names in the QAQC file with those in the metadata file

_____________________________________________________________________________


Metadata file used for comparison:

## /Users/kathe/Dropbox/CL_HUB_DOC/Data_Dictionary/GEO_metric_metadata_WIP.xlsx


Variables in the QAQC file that didn’t match the metadata:

lake_net_nearestdamdown_km
lake_net_up_damid
network_averagelakearea_ha

Metadata on QAQC variables that were matched:

variable_name | variable_description | units | data_type | taxonomy_type | n
lagoslakeid | Unique lake identifier developed by LAGOS-US | NULL | int | key | 1
lake_net_bidirectionallake_km | Distance to the nearest lake upstream or downstream using bi-directional graph. | kilometers | numeric | derived | 1
lake_net_damonlake_flag | A value of ‘Y’ indicates that there is at least one dam on this lake. | NULL | factor | information | 1
lake_net_downstreamlake_km | Distance to nearest downstream lake using a unidirectional graph. | kilometers | numeric | derived | 1
lake_net_downstreamlake_n | The number of lakes directly connected through streams downstream of a lake. | number | int | derived | 1
lake_net_lakeorder | Lake order follows the Strahler stream order of the stream that flows from it (outflowing), where the higher order stream is chosen if more than one outlet occurs (Riera et al. 2000, Martin and Soranno 2006). The exception is that headwater lakes are 0. If the lake is a terminal lake, it will receive the order of the highest inflowing stream. | NULL | int | derived | 1
lake_net_lnn | Lake network number (LNN) is the position of a lake within the network in reference to other lakes. The lake at the top of a network (i.e. no upstream lakes) will be 1, the next lake downstream will be 2, etc. If a lake has more than one lake upstream it will take the higher LNN. | NULL | int | derived | 1
lake_net_multidam_flag | A value of ‘Y’ indicates that there are multiple dams on a lake. | NULL | factor | information | 1
lake_net_nearestdamdown_id | The dam ID for the nearest downstream dam. Dam IDs are from the NABD dataset. | NULL | char | information | 1
lake_net_nearestdamup_km | Distance to nearest upstream dam. | kilometers | numeric | derived | 1
lake_net_totaldamup_n | The total number of dams upstream from a lake. | number | int | derived | 1
lake_net_totaldowndam_n | The total number of dams downstream from a lake. | number | int | derived | 1
lake_net_upstreamlake_km | Distance to nearest upstream lake using a unidirectional graph. | kilometers | numeric | derived | 1
lake_net_upstreamlake_n | The number of upstream lakes directly connected through streams to a lake. | number | int | derived | 1
net_averagelakedistance_km | Average distance between lakes in a network. | kilometers | numeric | derived | 1
net_dams_n | The number of total dams in a network. | number | int | derived | 1
net_id | The unique identifier assigned by LAGOS-NETS for each network | NULL | int | derived | 1
net_lakes_n | The total number of lakes in the lake network. | number | int | derived | 1
nhdplusv2_comid | Unique lake identifier from the NHD for the medium resolution NHDPlusV2. | NULL | char | key | 1
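
The name check in this section can be reproduced from the listings in this report alone. The Python sketch below is an illustration under that assumption, not the report's own code; `match_variables` is a hypothetical helper, and both name lists are transcribed from the structure listing and the matched-metadata table.

```python
def match_variables(file_columns, metadata_names):
    """Split file columns into those found in the metadata and those not found."""
    meta = set(metadata_names)
    matched = [c for c in file_columns if c in meta]
    unmatched = [c for c in file_columns if c not in meta]
    return matched, unmatched

# The 22 columns from the structure listing, in file order.
file_columns = [
    "lagoslakeid", "nhdplusv2_comid", "lake_net_upstreamlake_km",
    "lake_net_downstreamlake_km", "lake_net_bidirectionallake_km",
    "lake_net_upstreamlake_n", "lake_net_downstreamlake_n",
    "lake_net_lakeorder", "lake_net_lnn", "net_id", "net_lakes_n",
    "net_averagelakedistance_km", "network_averagelakearea_ha",
    "lake_net_nearestdamdown_km", "lake_net_nearestdamdown_id",
    "lake_net_totaldowndam_n", "lake_net_nearestdamup_km",
    "lake_net_up_damid", "lake_net_totaldamup_n",
    "lake_net_damonlake_flag", "lake_net_multidam_flag", "net_dams_n",
]

# The 19 variable names listed in the matched-metadata table.
metadata_names = [
    "lagoslakeid", "lake_net_bidirectionallake_km", "lake_net_damonlake_flag",
    "lake_net_downstreamlake_km", "lake_net_downstreamlake_n",
    "lake_net_lakeorder", "lake_net_lnn", "lake_net_multidam_flag",
    "lake_net_nearestdamdown_id", "lake_net_nearestdamup_km",
    "lake_net_totaldamup_n", "lake_net_totaldowndam_n",
    "lake_net_upstreamlake_km", "lake_net_upstreamlake_n",
    "net_averagelakedistance_km", "net_dams_n", "net_id", "net_lakes_n",
    "nhdplusv2_comid",
]

matched, unmatched = match_variables(file_columns, metadata_names)
print(len(matched))  # 19
print(unmatched)
# ['network_averagelakearea_ha', 'lake_net_nearestdamdown_km', 'lake_net_up_damid']
```

Two of the three flagged names look like naming variants of the surrounding columns (`network_` versus the `net_` prefix, and `lake_net_up_damid` versus the `lake_net_nearestdamdown_id` pattern). Whether they are misnamed or simply missing from the metadata cannot be determined from this report.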


_____________________________________________________________________________

Summary of lagoslakeid and any other categorical variables in the QAQC file

_____________________________________________________________________________

## Number of unique lagoslakeids = 86511


Data Frame Summary

qaqc_char

Dimensions: 86511 x 5
Duplicates: 0
No  Variable / Values                      Freqs (% of Valid)

1   nhdplusv2_comid [character]
    Valid: 86511 (100%)    Missing: 0 (0%)
     1. {00079e78-3ba6-4ca7-b607-              1 (0.0%)
     2. {00224290-8D37-4EF2-8E63-              1 (0.0%)
     3. {0065f460-ed3b-498c-9d66-              1 (0.0%)
     4. {00C1F714-3FDC-45E1-A94C-              1 (0.0%)
     5. {011A55F5-6AF2-402E-B333-              1 (0.0%)
     6. {012EE941-484D-4788-9BCB-              1 (0.0%)
     7. {016BE5BA-0E40-4A7D-9AA1-              1 (0.0%)
     8. {01BD24E9-B9FE-4401-B371-              1 (0.0%)
     9. {01D102A1-4B83-4E98-BEC7-              1 (0.0%)
    10. {01ECFAC0-DAEC-4284-A3D4-              1 (0.0%)
    [ 86501 others ]                       86501 (100.0%)

2   lake_net_nearestdamdown_id [character]
    Valid: 40101 (46.35%)    Missing: 46410 (53.65%)
     1. 11559                                519 (1.3%)
     2. 38221                                376 (0.9%)
     3. 99004                                185 (0.5%)
     4. 2032                                 155 (0.4%)
     5. 49764                                145 (0.4%)
     6. 17163                                121 (0.3%)
     7. 24255                                110 (0.3%)
     8. 24330                                105 (0.3%)
     9. 48892                                104 (0.3%)
    10. 17069                                 84 (0.2%)
    [ 30837 others ]                       38197 (95.2%)

3   lake_net_up_damid [character]
    Valid: 7211 (8.34%)    Missing: 79300 (91.66%)
     1. 2064                                   6 (0.1%)
     2. 47302                                  5 (0.1%)
     3. 1641                                   4 (0.1%)
     4. 17010                                  4 (0.1%)
     5. 332                                    4 (0.1%)
     6. 1580                                   3 (0.0%)
     7. 17067                                  3 (0.0%)
     8. 17069                                  3 (0.0%)
     9. 18517                                  3 (0.0%)
    10. 2122                                   3 (0.0%)
    [ 7117 others ]                         7173 (99.5%)

4   lake_net_damonlake_flag [character]
    Valid: 12630 (14.6%)    Missing: 73881 (85.4%)
     1. Y                                  12630 (100.0%)

5   lake_net_multidam_flag [character]
    Valid: 132 (0.15%)    Missing: 86379 (99.85%)
     1. Y                                    132 (100.0%)

Generated by summarytools 0.9.6 (R version 4.0.3)
2020-10-22


_____________________________________________________________________________

Summary of numeric variables in the QAQC file

_____________________________________________________________________________


Number of missing, zero, negative, and positive observations

qvar                           Missing   Zero  Negative  Positive  Total_n  Unique
lake_net_bidirectionallake_km        0    137         0     86374    86511   23031
lake_net_downstreamlake_km       10655     73         0     75783    86511   49635
lake_net_downstreamlake_n            0  10655         0     75856    86511      14
lake_net_lakeorder                   0  30861         1     55649    86511      11
lake_net_lnn                         0      0         0     86511    86511      50
lake_net_nearestdamdown_km       46410  27417         0     12684    86511   12041
lake_net_nearestdamup_km         79300    520         0      6691    86511    4919
lake_net_totaldamup_n            79300      0         0      7211    86511     138
lake_net_totaldowndam_n          46410      0         0     40101    86511      33
lake_net_upstreamlake_km         63003     71         0     23437    86511    7147
lake_net_upstreamlake_n              0  63003         0     23508    86511     194
net_averagelakedistance_km           0      0         0     86511    86511     882
net_dams_n                        1431      0         0     85080    86511      90
net_id                               0      0         0     86511    86511     898
net_lakes_n                          0      0         0     86511    86511     128
network_averagelakearea_ha           0      0         0     86511    86511     898
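
Each row of this table partitions a column's values into four classes. The Python sketch below shows one way to compute such a row; it is not the report's own code, and `is_missing` and `sign_counts` are hypothetical names. It assumes missing values are represented as `None` or NaN and are excluded from the unique count. The sample values are the first five entries of lake_net_upstreamlake_km in the structure listing.

```python
import math

def is_missing(v):
    """Treat None and float NaN as missing (the analogue of NA here)."""
    return v is None or (isinstance(v, float) and math.isnan(v))

def sign_counts(values):
    """Count missing, zero, negative and positive entries of one numeric column."""
    present = [v for v in values if not is_missing(v)]
    return {
        "Missing": len(values) - len(present),
        "Zero": sum(1 for v in present if v == 0),
        "Negative": sum(1 for v in present if v < 0),
        "Positive": sum(1 for v in present if v > 0),
        "Total_n": len(values),
        # Assumption: missing values do not count as a unique value.
        "Unique": len(set(present)),
    }

sample = [None, 2.02, 4.37, None, None]
counts = sign_counts(sample)
print(counts)
# {'Missing': 3, 'Zero': 0, 'Negative': 0, 'Positive': 2, 'Total_n': 5, 'Unique': 2}
```

The four classes are exhaustive, so every row should satisfy Missing + Zero + Negative + Positive = Total_n. For lake_net_lakeorder, for example, 0 + 30861 + 1 + 55649 = 86511.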


Summary statistics of numeric variables

qvar                                n         mean           sd     median       min        max
lake_net_bidirectionallake_km   86511      8.39838     12.32958     4.3910   0.00000    284.717
lake_net_downstreamlake_km      86511    164.79802    345.34992    26.0075   0.00000   2412.315
lake_net_downstreamlake_n       86511      1.43624      1.66076     1.0000   0.00000     13.000
lake_net_lakeorder              86511      1.08798      1.14511     1.0000  -1.00000      9.000
lake_net_lnn                    86511      1.64469      1.97461     1.0000   1.00000     50.000
lake_net_nearestdamdown_km      86511     37.04072    124.43498     0.0000   0.00000   1703.461
lake_net_nearestdamup_km        86511      7.89552     16.50401     2.6630   0.00000    269.745
lake_net_totaldamup_n           86511      9.59992    165.10662     1.0000   1.00000   6027.000
lake_net_totaldowndam_n         86511      2.82437      4.01491     1.0000   1.00000     32.000
lake_net_upstreamlake_km        86511      3.10160      7.85933     0.9870   0.00000    220.334
lake_net_upstreamlake_n         86511      1.43624     56.05566     0.0000   0.00000   7310.000
net_averagelakedistance_km      86511    756.51296    709.35417   350.5053   0.01500   1652.416
net_dams_n                      86511  10030.85990  11857.00127  1176.0000   1.00000  24986.000
net_id                          86511     43.57692    100.61235     9.0000   1.00000    898.000
net_lakes_n                     86511  13254.64162  15306.75484  2397.0000   2.00000  32811.000
network_averagelakearea_ha      86511     79.69620    246.26399    80.4685   1.19014  47157.153
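
For columns with missing values, statistics like these can only be computed over the non-missing entries. The Python sketch below illustrates that under stated assumptions; it is not the report's own code, and `summarize` is a hypothetical name. It uses the sample standard deviation and reports `n` as the count of non-missing values, whereas the table above shows the total row count (86511) for every variable. The sample values are the first five entries of lake_net_downstreamlake_km in the structure listing.

```python
import math
import statistics

def summarize(values):
    """Summary statistics of one numeric column, ignoring missing values."""
    present = [
        v for v in values
        if v is not None and not (isinstance(v, float) and math.isnan(v))
    ]
    if not present:
        return {"n": 0}
    return {
        "n": len(present),  # assumption: n counts non-missing values only
        "mean": statistics.mean(present),
        # Sample standard deviation; undefined for fewer than two values.
        "sd": statistics.stdev(present) if len(present) > 1 else float("nan"),
        "median": statistics.median(present),
        "min": min(present),
        "max": max(present),
    }

sample = [None, 11.766, 0.385, 88.237, None]
stats = summarize(sample)
print(stats["n"], stats["median"], stats["min"], stats["max"])  # 3 11.766 0.385 88.237
```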


_____________________________________________________________________________

Check on sums of percent composition variables

_____________________________________________________________________________

## Check not relevant to this dataset


_____________________________________________________________________________

Check on zonal data completeness

_____________________________________________________________________________


Summary of zones with datacoveragepct between 0 and 100 (Incomplete) or = 0 (Zero)

## Check not relevant to this dataset


_____________________________________________________________________________

Checks on missing values

_____________________________________________________________________________


Summary and maps of missing character values

## 160260 character variable observations have missing values
## 
## NOTE: table not printed if there are > 100 missing values



Summary and maps of missing numeric values

## 326509 numeric variable observations have missing values
## 
## NOTE: table not printed if there are > 100 missing values
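
Both totals in this section can be cross-checked against the per-variable Missing counts reported earlier. The Python sketch below does that arithmetic; all numbers are transcribed from this report, and the dictionary names are hypothetical.

```python
# Missing counts of the numeric variables (variables with zero missing omitted).
numeric_missing = {
    "lake_net_downstreamlake_km": 10655,
    "lake_net_nearestdamdown_km": 46410,
    "lake_net_nearestdamup_km": 79300,
    "lake_net_totaldamup_n": 79300,
    "lake_net_totaldowndam_n": 46410,
    "lake_net_upstreamlake_km": 63003,
    "net_dams_n": 1431,
}
numeric_total = sum(numeric_missing.values())
print(numeric_total)  # 326509

# Missing counts of the character variables from the data frame summary.
flag_missing = {
    "lake_net_damonlake_flag": 73881,
    "lake_net_multidam_flag": 86379,
}
dam_id_missing = {
    "lake_net_nearestdamdown_id": 46410,
    "lake_net_up_damid": 79300,
}
print(sum(flag_missing.values()))  # 160260
print(sum(flag_missing.values()) + sum(dam_id_missing.values()))  # 285970
```

The numeric total matches the 326509 reported above. The character total of 160260 equals the sum over the two flag variables only; including the two dam ID columns would give 285970. This report does not state why the dam ID columns are not counted.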



_____________________________________________________________________________

Spatial patterns of (selected) QAQC character variables

_____________________________________________________________________________


## 100% of the data are plotted for LAGOSUS_NETSv1.0_MedRes_Metrics_Dams
## Points are lake polygon centroids
## Note: When character variables have nearly unique values (e.g., names), maps are not created


_____________________________________________________________________________

Spatial patterns of numeric variables

_____________________________________________________________________________


## 100% of the data are plotted for LAGOSUS_NETSv1.0_MedRes_Metrics_Dams
## Points are lake polygon centroids


_____________________________________________________________________________

Overview of results of QAQC checks

_____________________________________________________________________________